ABSTRACT

The Human microRNA Disease Database (HMDD;
available at http://cmbi.bjmu.edu.cn/hmdd and
http://202.38.126.151/hmdd/tools/hmdd2.html)
is a collection of experimentally supported
human microRNA (miRNA) and disease associations.
Here, we describe the HMDD v2.0 update, which
offers several new options that help users
explore the data in the database. In the
updated database, miRNA-disease association data
were annotated in more detail. For example,
miRNA-disease association data from genetics, epi- 
genetics, circulating miRNAs and miRNA-target 
interactions were integrated into the database. In 
addition, HMDD v2.0 presented more data that 
were generated based on concepts derived from 
the miRNA-disease association data, including 
disease spectrum width of miRNAs and miRNA 
spectrum width of human diseases. Moreover, we
provided users with a link to download all the data in
HMDD v2.0 and a link to submit novel data to
the database. Meanwhile, we also maintained the
old version of HMDD. By keeping data sets up-to- 
date, HMDD should continue to serve as a valuable 
resource for investigating the roles of miRNAs in 
human disease. 

INTRODUCTION 

MicroRNAs (miRNAs) are a class of important small
noncoding RNA molecules that mainly repress gene
expression at the posttranscriptional level (1). Generally,
one miRNA can regulate hundreds of target genes,
and one gene can also be regulated by hundreds of
miRNAs (2). So far, ~2000 miRNAs have been identified
in the human genome according to the miRBase database
release 20 (3). A growing number of studies have shown
that miRNAs play critical roles in many important
biological processes (4). Therefore, miRNA-related
dysfunctions could be associated with a broad spectrum of
diseases, including cancer (5) and cardiovascular diseases
(6). miRNAs have thus become a novel class
of potential biomarkers or targets for disease diagnosis
and therapy (7).

To date, a number of miRNA-related databases have
been developed. These databases have been of great
help in providing valuable miRNA-related information
such as sequences (3), experimentally supported miRNA
targets (8-10), mutations (11-13), experimentally supported
miRNA transcription factors (14,15), miRNA-drug
interactions (16,17) and miRNA-associated diseases (18-21).
For example, miR2Disease is a database of experimentally
supported miRNA-disease associations and miRNA-target
interactions, and miRTarBase is a database of experimentally
validated miRNA-target interactions. Both databases
have provided great help in miRNA-related studies.

To our knowledge, the Human microRNA Disease
Database (HMDD), which was released in December
2007 and has been updated ~30 times during the past 5
years, is one of the first databases for miRNA-associated
diseases (18). Here, we introduce HMDD v2.0, which
collects >10 000 experimentally supported miRNA-disease
association entries, covering ~600 miRNA genes
and ~400 human diseases from >3000 articles. We also
annotated specific classes of entries, for example, entries
whose experimental evidence is from genetics, epigenetics,
circulating miRNAs and miRNA-target interactions. In
addition, an analysis of trends in miRNA-disease
investigations was performed. Finally, we
summarize the usage of data sets in HMDD v2.0.

SYSTEM OVERVIEW 

The aim of HMDD v2.0 is to provide a web interface for
users to browse, search and download data sets in the
database, and to submit novel data to the database. To
collect the experimentally supported miRNA-disease
associations, we first obtained all miRNA-related
publications from the PubMed database using the keywords
'microRNA', 'miRNA' or 'miR'. Then, we manually
retrieved entries related to miRNA-disease
associations. Every entry contains four items: the
miRNA name, the disease name, the experimental evidence for
the miRNA-disease association and the PubMed ID of the
publication. The miRNA name and the disease name
are normalized. We further annotated the data in more
detail, marking entries whose experimental evidence is
from genetics, epigenetics and miRNA-target interactions.
More recently, a growing number of studies have revealed
that a number of miRNAs are stably present in the circulation
and could be biomarkers for disease diagnosis and
treatment (22). Therefore, we also annotated miRNA-disease
entries whose evidence is from circulating miRNAs in
HMDD v2.0. In addition, we integrated disease-related
miRNA-target interactions from miR2Disease,
miRTarBase and TarBase. As a result, HMDD collected
10 368 entries that include 572 miRNA genes and 378 diseases
from 3511 articles.

In the HMDD v2.0 database, all data are
organized using SQLite, a lightweight database
management system. The Web site was developed with
Django, a Python web framework. The database is
available at http://cmbi.bjmu.edu.cn/hmdd and
http://202.38.126.151/hmdd/tools/hmdd2.html.
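
As an illustration of how entries of this shape could be held in SQLite, the following minimal Python sketch stores the four items of an entry in a single table. The table name `entry`, the column names and the helper `build_demo_db` are assumptions made for this example; the actual HMDD v2.0 schema is not described here. The rows are placeholders, not database records.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative only.
SCHEMA = """
CREATE TABLE entry (
    mirna    TEXT NOT NULL,     -- normalized miRNA name
    disease  TEXT NOT NULL,     -- normalized disease name
    evidence TEXT NOT NULL,     -- description of the experimental evidence
    pmid     INTEGER NOT NULL   -- PubMed ID of the supporting publication
)
"""


def build_demo_db(rows):
    """Create an in-memory SQLite database holding the given entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


# Placeholder rows for demonstration; these are not real HMDD records.
demo_rows = [
    ("hsa-mir-a", "Disease X", "placeholder evidence text", 1),
    ("hsa-mir-a", "Disease Y", "placeholder evidence text", 2),
    ("hsa-mir-b", "Disease X", "placeholder evidence text", 3),
]
conn = build_demo_db(demo_rows)

# Summary counts of the kind reported above:
# entries, distinct miRNAs, distinct diseases, distinct articles.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT mirna), COUNT(DISTINCT disease), "
    "COUNT(DISTINCT pmid) FROM entry"
).fetchone()
print(counts)
```

A single flat table is enough for a sketch; a production schema would probably normalize miRNAs and diseases into their own tables.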

QUERYING THE DATABASE 

We provide users with several ways to query the HMDD v2.0
database. First, users can browse HMDD v2.0 by
miRNA names or disease names. When a user clicks a
miRNA or disease on the 'Browse' page, HMDD v2.0
returns a list of matched entries. Second, we
provide a 'fuzzy search' function that finds entries by the
full or partial names of miRNAs or diseases on the 'Search'
page. The search is case-insensitive. Moreover, all data
in the database can be freely downloaded. Users can
also submit novel data to the database. In addition, a
detailed tutorial on the usage of the database is available
on the 'Help' page.
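
The search behaviour described above (full or partial names, ignoring case) can be approximated by a case-insensitive substring match. The sketch below is only one plausible reading: the matching rule actually used by the Web site is not specified here, and the function name `search_entries`, the dictionary keys and the sample entries are hypothetical.

```python
def search_entries(entries, query):
    """Return entries whose miRNA or disease name contains `query`,
    ignoring case (a plain substring match, assumed for illustration)."""
    needle = query.strip().lower()
    if not needle:
        # An empty query matches nothing rather than everything.
        return []
    return [
        e for e in entries
        if needle in e["mirna"].lower() or needle in e["disease"].lower()
    ]


# Placeholder entries; these are not real HMDD records.
demo_entries = [
    {"mirna": "hsa-mir-a", "disease": "Disease X"},
    {"mirna": "hsa-mir-a", "disease": "Disease Y"},
    {"mirna": "hsa-mir-b", "disease": "Disease X"},
]

hits = search_entries(demo_entries, "MIR-A")
print(len(hits))
```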

THE USAGE OF THE DATA SETS IN HMDD v2.0 

Besides the general database search, browse and download
functions introduced above, users can carry out their own
research based on the data sets in HMDD. Here, we
introduce a new analysis of HMDD data sets and
summarize previous HMDD-based research.

Historical trends in miRNA-disease relationship
investigations

By analyzing the publication time distribution of the total
entries, entries from genetics, entries from epigenetics,
entries from circulating miRNAs and entries from
miRNA-target interactions, we showed that
the total number of publications on miRNA-disease
relationships has increased dramatically (Figure 1A). This
suggests that the investigation of miRNAs in the pathogenesis of
human disease continues to be one of the hottest
fields in biomedical research. However, the four
types of specific entries show different patterns. Entries
from genetics (Figure 1B) increased dramatically from
2002 to 2007, and then increased slowly after 2007. In
contrast, entries from epigenetics (Figure 1C), circulating
miRNAs (Figure 1D) and miRNA-target interactions
(Figure 1E) increased dramatically in recent years,
suggesting that they are hot topics in current miRNA-disease
research. For circulating miRNAs in particular, there is
no entry for a circulating miRNA-disease relationship
before 2009, but such entries increased dramatically after
2009, suggesting that the establishment of circulating
miRNAs as biomarkers for the diagnosis and treatment of
diseases has become a hot topic in miRNA
research.
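
A cumulative count per publication year, of the kind plotted in Figure 1, can be derived from a list of publication years with one entry per database record. The sketch below assumes that a publication year has already been attached to each entry (for example, looked up from its PubMed ID); the function name `cumulative_by_year` and the sample years are illustrative, not HMDD data.

```python
from collections import Counter
from itertools import accumulate


def cumulative_by_year(years):
    """Map each year in the covered range to the cumulative number of entries."""
    per_year = Counter(years)
    if not per_year:
        return {}
    # Include years with no entries so the curve has no gaps.
    span = range(min(per_year), max(per_year) + 1)
    totals = accumulate(per_year.get(y, 0) for y in span)
    return dict(zip(span, totals))


# Placeholder publication years, one per entry.
demo_years = [2009, 2009, 2011, 2012, 2012, 2012]
curve = cumulative_by_year(demo_years)
print(curve)
```

Running the same function on the subset of entries of each evidence type would give one curve per panel.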

Predicting miRNA-associated functions, disease and 
environmental factors 

The miRNA-disease association data in HMDD can also 
be used to predict novel miRNA-associated diseases and 
functions. The hypothesis is that miRNAs with similar 
functions tend to be associated with diseases with similar 
phenotypes, and vice versa (18). Using HMDD miRNA-disease
association data, we previously developed a
graph-based method to evaluate the functional similarity
of miRNAs and then to infer novel miRNA-associated
diseases (23). Based on the miRNA functional similarity,
two other labs developed network-based methods to 
predict novel miRNA-associated diseases (24,25). In 
addition, it is also possible to predict the relationship 
between environmental factors and diseases by calculating 
the similarity of their miRNA signatures in HMDD 
v2.0 (26). 
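
To make the idea of comparing miRNA signatures concrete, the sketch below ranks diseases by the Jaccard index between a query miRNA set and the miRNA set of each disease. This is a deliberately simple stand-in: it is not the graph-based or network-based method of the cited studies, and the similarity measure used in (26) is not given here. The function names and the sample sets are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_diseases(signature, disease_to_mirnas):
    """Rank diseases by similarity of their miRNA sets to `signature`.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    scores = {d: jaccard(signature, m) for d, m in disease_to_mirnas.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Placeholder data: a miRNA signature (e.g. of an environmental factor)
# and the miRNA sets of three diseases. Not real HMDD content.
demo_signature = {"mir-a", "mir-b"}
demo_disease_sets = {
    "Disease X": {"mir-a", "mir-b", "mir-c"},
    "Disease Y": {"mir-c"},
    "Disease Z": {"mir-a"},
}
ranking = rank_diseases(demo_signature, demo_disease_sets)
print(ranking)
```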

miRNA set enrichment analysis 

A miRNA set is defined as a group of miRNAs that have 
some specific features (18). According to this rule, 
miRNAs that are associated with the same disease, for 
example, breast cancer, will be presented as one miRNA 
set. As a result, HMDD can generate ~400 miRNA sets. 
Using the miRNA enrichment analysis tool, TAM (27), it 
is easy to investigate enriched miRNA sets for a given list 
of miRNAs, for example, the upregulated miRNAs from a 
microarray experiment. This analysis makes it possible not
only to find patterns or rules behind a set of miRNAs
but also to predict novel miRNA-disease associations.
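
Over-representation of a miRNA set in a query list is commonly scored with a one-sided hypergeometric test. The sketch below shows that calculation under stated assumptions: the statistical procedure used by TAM is not described in this text, the background size is supplied by the caller, no multiple-testing correction is applied, and all names and data are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """P(X >= k) when drawing n items without replacement from N items,
    K of which belong to the set of interest."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enrich(query, mirna_sets, background_size):
    """Score every miRNA set against the query list.

    Assumes the query and all sets are drawn from one background of
    `background_size` miRNAs. Returns (set name, overlap, p-value)
    tuples sorted by p-value, then by name.
    """
    query = set(query)
    results = []
    for name, members in mirna_sets.items():
        members = set(members)
        overlap = len(query & members)
        p = hypergeom_pvalue(overlap, len(members), len(query), background_size)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: (r[2], r[0]))


# Placeholder miRNA sets keyed by disease, and a placeholder query list.
demo_sets = {
    "Disease X": {"m1", "m2", "m3", "m4", "m5"},
    "Disease Y": {"m6", "m7", "m8", "m9", "m10"},
}
demo_query = ["m1", "m2", "m3", "m11"]
results = enrich(demo_query, demo_sets, background_size=20)
print(results[0][0], results[0][1])
```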

Figure 1. The historically cumulative number of total entries (A), entries from genetics (B), entries from epigenetics (C), entries from circulating
miRNAs (D) and entries from miRNA-target interaction (E).



Disease spectrum width of a miRNA and miRNA 
spectrum width of a disease 

The concept of the disease spectrum width (DSW) of a
miRNA was originally proposed by us in a previous
study (26). For one miRNA i, DSW(i) = n(i)/N, where
n(i) is the number of diseases associated with miRNA i and
N is the total number of diseases that have been reported
to be associated with miRNAs (26). We previously
showed that the DSW of a miRNA could be a metric to
evaluate its importance in function and human disease
(26). For example, miR-21 has the largest DSW (0.33),
suggesting that it has a wide disease spectrum and plays an
important role in many diseases. The top 10 miRNAs with
the largest DSWs are listed in Figure 2A. All of these
miRNAs are widely accepted to have critical
functions and roles in human diseases. Using a procedure
similar to that described above, we also introduced
another novel concept and metric for a disease, the
miRNA spectrum width (MSW) of a disease. This
metric could be used to evaluate the severity of a given
disease. For example, all of the top 10 diseases (Figure 2B)
with the largest MSWs are among the most lethal human
diseases. In the future, more case studies are needed to
confirm and consolidate the usefulness of the two
metrics. We provide downloadable files for both in
HMDD v2.0.
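
The DSW definition above translates directly into code. The sketch below computes DSW(i) = n(i)/N from a list of (miRNA, disease) pairs. For MSW it assumes the analogous form m(j)/M, where m(j) is the number of miRNAs associated with disease j and M is the total number of miRNAs with at least one reported disease association; this formula is inferred by analogy and is not stated explicitly in the text. The function name and sample pairs are illustrative.

```python
from collections import defaultdict


def spectrum_widths(pairs):
    """Compute DSW per miRNA and MSW per disease from (miRNA, disease) pairs.

    Duplicate pairs (e.g. one association reported by several articles)
    are counted once.
    """
    diseases_of = defaultdict(set)
    mirnas_of = defaultdict(set)
    for mirna, disease in pairs:
        diseases_of[mirna].add(disease)
        mirnas_of[disease].add(mirna)

    n_diseases = len(mirnas_of)   # N: diseases associated with any miRNA
    n_mirnas = len(diseases_of)   # M: miRNAs associated with any disease (assumed analogue)

    dsw = {m: len(d) / n_diseases for m, d in diseases_of.items()}
    msw = {d: len(m) / n_mirnas for d, m in mirnas_of.items()}
    return dsw, msw


# Placeholder associations; these are not real HMDD records.
demo_pairs = [
    ("mir-a", "Disease X"),
    ("mir-a", "Disease Y"),
    ("mir-a", "Disease X"),  # duplicate association from a second article
    ("mir-b", "Disease X"),
    ("mir-c", "Disease Z"),
]
dsw, msw = spectrum_widths(demo_pairs)
print(max(dsw, key=dsw.get), max(msw, key=msw.get))
```

Sorting either dictionary by value in descending order and taking the first 10 items gives a ranking of the kind shown in Figure 2.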



CONCLUSION 

A growing number of studies have shown that miRNAs have
important functions and are associated with the development and
progression of a broad range of diseases. miRNAs have
become novel candidate molecules for disease
diagnosis, treatment and prognosis. In this article, we describe
an update (HMDD v2.0) of the HMDD. The HMDD
database integrates experimentally supported miRNA-disease
association data and further annotates four
types of miRNA-disease association data. Based on
miRNA-disease associations, we proposed and calculated
two miRNA-related metrics, DSW of miRNAs and MSW
of diseases, and provided downloadable files for them. By
analyzing the publication time of the miRNA-disease
associations from the past decade included in
HMDD, we showed that this simple analysis could
predict trends in miRNA-disease relationship
investigations. For example, entries from circulating
miRNAs have increased most dramatically in recent years.
According to this result, it is expected that more data
regarding the associations of circulating miRNAs with
human diseases will be generated in the coming years.
This suggests that the identification of circulating
miRNAs as disease biomarkers is one of the hottest
topics in miRNA research. In addition, we provided
evidence that HMDD data sets are also useful for
predicting miRNA-associated diseases, functions and
environmental factor-disease relationships. The important roles
of miRNAs in diseases are attracting more biomedical
researchers. Therefore, it is expected that HMDD
v2.0 will integrate more experimentally supported
miRNA-disease associations in the future. Finally, we
believe that HMDD v2.0 is useful for studies of
miRNA-disease associations, and will provide more help
in this field when it integrates more data and tools in the
future.

Figure 2. The top 10 miRNAs with the biggest DSW (A) and the top
10 diseases with the biggest MSW (B).